In this paper, we propose an adaptive method for clustering and visualizing data via an orthogonalization process. Starting from data points represented within the diffusion-map framework, the method adaptively increases the orthogonality of the clusters by applying a feedback mechanism inspired by the Gromov-Wasserstein distance. This mechanism iteratively widens the spectral gap and refines the orthogonality of the data representation to achieve clustering with high specificity. By working within the diffusion-map framework and representing relations between data points via transition probabilities, the method is robust with respect to the underlying distance, noise in the data, and random initialization. We prove that the method converges globally to a unique fixed point for certain parameter values. We also propose a related method in which the transition probabilities of the Markov process are required to be doubly stochastic, in which case the method produces a minimizer of a non-convex optimization problem. We apply the method to cryo-electron microscopy image data from biopharmaceutical manufacturing, where we confirm biologically relevant insights related to therapeutic efficacy. We consider an example with morphological variations in gene packaging and confirm that the method yields biologically meaningful clustering results consistent with human expert classification.
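To make the setting concrete, the following is a minimal sketch of the standard diffusion-map construction that the method above starts from: a Gaussian affinity kernel is row-normalized into a Markov transition matrix, and the leading nontrivial eigenvectors give the embedding. The adaptive orthogonality feedback itself is not reproduced here; the input `X` and kernel bandwidth `sigma` are illustrative assumptions.

```python
# Sketch of the standard diffusion-map embedding the described method
# builds on; the adaptive orthogonality feedback is not shown.
import numpy as np

def diffusion_map(X, sigma=1.0, n_components=2):
    # Pairwise squared Euclidean distances.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Gaussian affinity kernel.
    K = np.exp(-sq_dists / (2 * sigma ** 2))
    # Row-normalize into a Markov transition matrix.
    P = K / K.sum(axis=1, keepdims=True)
    # Eigendecomposition; the top eigenvalue is 1 (trivial constant vector).
    eigvals, eigvecs = np.linalg.eig(P)
    order = np.argsort(-eigvals.real)
    eigvals, eigvecs = eigvals.real[order], eigvecs.real[:, order]
    # Embed with the leading nontrivial eigenvectors, scaled by eigenvalues;
    # the drop after the kept eigenvalues is the "spectral gap" above.
    return eigvecs[:, 1:n_components + 1] * eigvals[1:n_components + 1]

embedding = diffusion_map(np.random.randn(100, 5))
```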
Multi-marginal optimal transport (MOT) is a generalization of optimal transport to multiple marginals. Optimal transport has evolved into an important tool in many machine learning applications, and its multi-marginal extension opens up the possibility of addressing new challenges in the field. However, the use of MOT is largely limited by its computational complexity, which scales exponentially in the number of marginals. Fortunately, in many applications, such as barycenter or interpolation problems, the cost function admits structure that has recently been exploited to develop efficient computational methods. In this work, we derive computational bounds for these methods. With $m$ marginal distributions, each supported on $n$ points, we provide a $\mathcal{\tilde{O}}(d(G) m n^2 \epsilon^{-2})$ bound for $\epsilon$-accuracy when the problem is associated with a tree of diameter $d(G)$. For the special case of the Wasserstein barycenter problem, which corresponds to a star-shaped tree, our bound aligns with existing complexity results.
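For background, here is a hedged sketch of the entropic-regularized Sinkhorn iterations for the classical two-marginal problem, which the structured multi-marginal methods discussed above generalize. The cost matrix `C`, marginals `mu` and `nu`, and regularization `eps` are illustrative assumptions, not the paper's setup.

```python
# Minimal Sinkhorn sketch for two-marginal entropic optimal transport.
import numpy as np

def sinkhorn(C, mu, nu, eps=0.1, n_iter=500):
    K = np.exp(-C / eps)          # Gibbs kernel from the cost matrix
    u = np.ones_like(mu)
    for _ in range(n_iter):
        v = nu / (K.T @ u)        # scale to match the second marginal
        u = mu / (K @ v)          # scale to match the first marginal
    return u[:, None] * K * v[None, :]  # transport plan

n = 50
C = np.abs(np.subtract.outer(np.linspace(0, 1, n), np.linspace(0, 1, n)))
mu = nu = np.full(n, 1.0 / n)
plan = sinkhorn(C, mu, nu)
```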
Automatically identifying harmful content in video is an important task with a wide range of applications. However, there is a lack of open datasets with professional labels. In this work, we present VidHarm, an open dataset of 3,589 video clips from movie trailers annotated by professionals. We analyze the dataset, revealing the relation between clip-level and trailer-level annotations. Audiovisual models are trained on the dataset, and an in-depth study of the modeling choices is conducted. The results show that combining the visual and audio modalities, pretraining on large-scale video-recognition datasets, and class-balanced sampling significantly improve performance. Finally, biases of the trained models are studied using discrimination probing. VidHarm is openly available, and further details are available at: https://vidharm.github.io.
Data-driven models such as neural networks are being applied more and more to safety-critical applications, such as the modeling and control of cyber-physical systems. Despite the flexibility of the approach, there are still concerns about the safety of these models in this context, as well as the need for large amounts of potentially expensive data. In particular, when long-term predictions are needed or frequent measurements are not available, the open-loop stability of the model becomes important. However, it is difficult to make such guarantees for complex black-box models such as neural networks, and prior work has shown that model stability is indeed an issue. In this work, we consider an aluminum extraction process where measurements of the internal state of the reactor are time-consuming and expensive. We model the process using neural networks and investigate the role of including skip connections in the network architecture as well as using l1 regularization to induce sparse connection weights. We demonstrate that these measures can greatly improve both the accuracy and the stability of the models for datasets of varying sizes.
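As an illustration of the two measures named above, the following is a minimal PyTorch sketch (not the paper's exact model) of an MLP with a skip connection and an L1 penalty on the weights to induce sparse connections. Layer sizes and `l1_lambda` are assumptions.

```python
import torch
import torch.nn as nn

class SkipMLP(nn.Module):
    def __init__(self, dim_in, dim_hidden, dim_out):
        super().__init__()
        self.fc1 = nn.Linear(dim_in, dim_hidden)
        self.fc2 = nn.Linear(dim_hidden, dim_hidden)
        self.out = nn.Linear(dim_hidden, dim_out)
        self.act = nn.ReLU()

    def forward(self, x):
        h = self.act(self.fc1(x))
        h = h + self.act(self.fc2(h))   # skip connection around fc2
        return self.out(h)

model = SkipMLP(8, 64, 4)
x, y = torch.randn(32, 8), torch.randn(32, 4)
mse = nn.functional.mse_loss(model(x), y)
# L1 regularization drives many connection weights toward zero.
l1_lambda = 1e-4
l1 = sum(p.abs().sum() for p in model.parameters())
loss = mse + l1_lambda * l1
loss.backward()
```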
In this paper we take the first steps in studying a new approach to the synthesis of efficient communication schemes in multi-agent systems, trained via reinforcement learning. We combine symbolic methods with machine learning, in what is referred to as a neuro-symbolic system. The agents are not restricted to only use initial primitives: reinforcement learning is interleaved with steps to extend the current language with novel higher-level concepts, allowing generalisation and more informative communication via shorter messages. We demonstrate that this approach allows agents to converge more quickly on a small collaborative construction task.
We test the performance of GAN models for lip-synchronization. For this, we reimplement LipGAN in PyTorch, train it on the GRID dataset, and compare it to our own variation, L1WGAN-GP, adapted to the LipGAN architecture and also trained on GRID.
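The following is a hedged sketch of the two ingredients the name L1WGAN-GP suggests: a WGAN critic loss with gradient penalty, plus an L1 reconstruction term on the generator. The networks and the weights `gp_weight` and `l1_weight` are illustrative assumptions, not the authors' exact setup.

```python
import torch

def gradient_penalty(critic, real, fake, gp_weight=10.0):
    # Assumes 4D image tensors (batch, channels, height, width).
    alpha = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
    scores = critic(interp)
    grads, = torch.autograd.grad(scores.sum(), interp, create_graph=True)
    # Penalize deviation of the critic's gradient norm from 1.
    return gp_weight * ((grads.flatten(1).norm(2, dim=1) - 1) ** 2).mean()

def critic_loss(critic, real, fake):
    fake = fake.detach()  # do not backprop into the generator here
    return critic(fake).mean() - critic(real).mean() \
        + gradient_penalty(critic, real, fake)

def generator_loss(critic, fake, target, l1_weight=100.0):
    adv = -critic(fake).mean()                       # fool the critic
    rec = torch.nn.functional.l1_loss(fake, target)  # L1 reconstruction
    return adv + l1_weight * rec
```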
High content imaging assays can capture rich phenotypic response data for large sets of compound treatments, aiding in the characterization and discovery of novel drugs. However, extracting representative features from high content images that can capture subtle nuances in phenotypes remains challenging. The lack of high-quality labels makes it difficult to achieve satisfactory results with supervised deep learning. Self-supervised learning methods, which learn from automatically generated labels and have shown great success on natural images, offer an attractive alternative for microscopy images as well. However, we find that self-supervised learning techniques underperform on high content imaging assays. One challenge is the undesirable domain shifts present in the data, known as batch effects, which may be caused by biological noise or uncontrolled experimental conditions. To this end, we introduce Cross-Domain Consistency Learning (CDCL), a novel approach that is able to learn in the presence of batch effects. CDCL enforces the learning of biological similarities while disregarding undesirable batch-specific signals, which leads to more useful and versatile representations. These features are organised according to their morphological changes and are more useful for downstream tasks, such as distinguishing treatments and mode of action.
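To illustrate the general principle (not the CDCL objective itself, which is not specified here), the sketch below pulls together embeddings of the same treatment drawn from two different experimental batches, so that batch-specific signal is suppressed; the embedding dimensions and `temperature` are assumptions.

```python
import torch
import torch.nn.functional as F

def consistency_loss(z_batch_a, z_batch_b, temperature=0.1):
    # Row i of each tensor embeds the same treatment imaged in two batches.
    za = F.normalize(z_batch_a, dim=1)
    zb = F.normalize(z_batch_b, dim=1)
    logits = za @ zb.t() / temperature          # cross-batch similarities
    targets = torch.arange(za.size(0), device=za.device)
    # Matching cross-batch pairs (the diagonal) should be most similar.
    return F.cross_entropy(logits, targets)

loss = consistency_loss(torch.randn(16, 128), torch.randn(16, 128))
```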
Objective: Imbalances of the electrolyte concentration levels in the body can lead to catastrophic consequences, but accurate and accessible measurements could improve patient outcomes. While blood tests provide accurate measurements, they are invasive and the laboratory analysis can be slow or inaccessible. In contrast, an electrocardiogram (ECG) is a widely adopted tool which is quick and simple to acquire. However, the problem of estimating continuous electrolyte concentrations directly from ECGs is not well-studied. We therefore investigate if regression methods can be used for accurate ECG-based prediction of electrolyte concentrations. Methods: We explore the use of deep neural networks (DNNs) for this task. We analyze the regression performance across four electrolytes, utilizing a novel dataset containing over 290,000 ECGs. For improved understanding, we also study the full spectrum from continuous predictions to binary classification of extreme concentration levels. To enhance clinical usefulness, we finally extend to a probabilistic regression approach and evaluate different uncertainty estimates. Results: We find that the performance varies significantly between different electrolytes, which is clinically justified by the interplay of electrolytes and their manifestation in the ECG. We also compare the regression accuracy with that of traditional machine learning models, demonstrating superior performance of DNNs. Conclusion: Discretization can lead to good classification performance, but does not help solve the original problem of predicting continuous concentration levels. While probabilistic regression demonstrates potential practical usefulness, the uncertainty estimates are not particularly well-calibrated. Significance: Our study is a first step towards accurate and reliable ECG-based prediction of electrolyte concentration levels.
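A minimal sketch of the kind of probabilistic regression head referred to above: the network predicts a mean and a variance per electrolyte and is trained with a Gaussian negative log-likelihood. The architecture and dimensions are illustrative assumptions, not the paper's model.

```python
import torch
import torch.nn as nn

class ProbRegressionHead(nn.Module):
    def __init__(self, dim_in, n_targets=4):  # e.g. four electrolytes
        super().__init__()
        self.mean = nn.Linear(dim_in, n_targets)
        self.log_var = nn.Linear(dim_in, n_targets)

    def forward(self, features):
        return self.mean(features), self.log_var(features).exp()

head = ProbRegressionHead(dim_in=256)
features, targets = torch.randn(32, 256), torch.randn(32, 4)
mean, var = head(features)
# The predicted variance doubles as a per-sample uncertainty estimate.
loss = nn.GaussianNLLLoss()(mean, targets, var)
loss.backward()
```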
Workplace injuries are common in today's society due to a lack of adequately worn safety equipment. A system that only admits appropriately equipped personnel can be created to improve working conditions. The goal is thus to develop a system that will improve workers' safety using a camera that will detect the usage of Personal Protective Equipment (PPE). To this end, we collected and labeled appropriate data from several public sources, which have been used to train and evaluate several models based on the popular YOLOv4 object detector. Our focus, driven by a collaborating industrial partner, is to implement our system into an entry control point where workers must present themselves to obtain access to a restricted area. Combined with facial identity recognition, the system would ensure that only authorized people wearing appropriate equipment are granted access. A novelty of this work is that we increase the number of classes to five objects (hardhat, safety vest, safety gloves, safety glasses, and hearing protection), whereas most existing works only focus on one or two classes, usually hardhats or vests. The AI model developed provides good detection accuracy at distances of 3 and 5 meters in the collaborative environment where we aim to operate (mAP of 99% and 89%, respectively). The small size of some objects and the potential occlusion by body parts have been identified as factors that are detrimental to accuracy, which we have counteracted via data augmentation and cropping of the body before applying PPE detection.
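A hedged sketch of the crop-before-detect step mentioned above: the body is localized first, then PPE detection runs on the enlarged crop so that small objects occupy more of the input. `person_detector` and `ppe_detector` are hypothetical stand-ins for trained detection models, not the authors' code.

```python
def detect_ppe(frame, person_detector, ppe_detector, margin=0.1):
    # `frame` is assumed to be an image array indexed as frame[y, x].
    detections = []
    for (x1, y1, x2, y2) in person_detector(frame):
        # Pad the person box so equipment at the edges is not cut off.
        dx, dy = int((x2 - x1) * margin), int((y2 - y1) * margin)
        crop = frame[max(0, y1 - dy):y2 + dy, max(0, x1 - dx):x2 + dx]
        # Run PPE detection on the enlarged crop.
        detections.append(ppe_detector(crop))
    return detections
```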
Storytelling and narrative are fundamental to human experience, intertwined with our social and cultural engagement. As such, researchers have long attempted to create systems that can generate stories automatically. In recent years, powered by deep learning and massive data resources, automatic story generation has shown significant advances. However, considerable challenges, like the need for global coherence in generated stories, still hamper generative models from reaching the same storytelling ability as human narrators. To tackle these challenges, many studies seek to inject structured knowledge into the generation process, which is referred to as structure knowledge-enhanced story generation. Incorporating external knowledge can enhance the logical coherence among story events, achieve better knowledge grounding, and alleviate over-generalization and repetition problems in stories. This survey provides an up-to-date and comprehensive review of this research field: (i) we present a systematic taxonomy regarding how existing methods integrate structured knowledge into story generation; (ii) we summarize involved story corpora, structured knowledge datasets, and evaluation metrics; (iii) we give multidimensional insights into the challenges of knowledge-enhanced story generation and cast light on promising directions for future study.